62 research outputs found

    Indexing of Reading Paths for a Structured Information Retrieval on the Web

    No full text
    International audience. In this paper, we present a hyperdocument model that takes into account the essential aspects of information on the Web: content, composition (logical structure) and nonlinear reading (hypertext structure). We have developed a Structured Information Retrieval System (SIRS) based on this model. Its indexing and querying phases are based on a “reading paths” view of the Web: a Web site is considered as a set of potential reading paths, instead of a set of atomic and flat pages. We have developed a specific algorithm to index the reading paths. We present experiments that evaluate the benefit of indexing reading paths.

    BM25t: a BM25 extension for focused information retrieval

    No full text
    25 pages. International audience. This paper addresses the integration of XML tags into a term-weighting function for focused XML Information Retrieval (IR). Our model allows us to consider a certain kind of structural information: tags that represent a logical structure (e.g. title, section, paragraph, etc.) as well as other tags (e.g. bold, italic, center, etc.). We take into account the influence of a tag by estimating the probability for this tag to distinguish relevant terms from the others. Then, these weights are integrated in a term-weighting function. Experiments on a large collection from the INEX 2008 XML IR evaluation campaign showed improvements on focused XML retrieval.
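The idea of folding tag weights into a term-weighting function can be sketched as follows. This is only an illustration, not the BM25t formula from the paper, which the abstract does not give. It assumes that each occurrence of a term contributes the weight of its enclosing tag to the term frequency, and that this weighted frequency is then fed to the classical BM25 formula. The tag weights, function names and parameter values are all hypothetical.

```python
import math


def bm25_term_weight(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Classical BM25 weight of one term in one document."""
    # Smoothed inverse document frequency, always positive.
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # Document-length normalisation of the saturation constant.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + norm)


def tag_weighted_tf(occurrence_tags, tag_weights, default=1.0):
    """Assumed scheme: each occurrence counts as the weight of its enclosing tag."""
    return sum(tag_weights.get(tag, default) for tag in occurrence_tags)


# Hypothetical learned weights: how well each tag marks relevant terms.
tag_weights = {"title": 2.0, "b": 1.5, "p": 1.0}

# A term occurring once in a title and twice in plain paragraphs.
occurrences = ["title", "p", "p"]
plain_tf = len(occurrences)
weighted_tf = tag_weighted_tf(occurrences, tag_weights)

stats = dict(df=10, n_docs=1000, doc_len=100, avg_doc_len=100)
w_plain = bm25_term_weight(plain_tf, **stats)
w_tagged = bm25_term_weight(weighted_tf, **stats)
print(weighted_tf, w_plain < w_tagged)
```

Under these assumptions, an occurrence inside a highly weighted tag raises the term's score, while BM25 saturation still bounds the effect.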

    LaHC at CLEF 2015 SBS Lab

    No full text
    International audience. This paper describes the work of the LaHC lab of Saint-Étienne for the Social Book Search lab at CLEF 2015. Our goals were i) to study a field-based retrieval model (BM25F), exploiting various topic and document fields, in order to build a strong baseline for further experiments, ii) to compare it with a log-logistic (LGD) retrieval model, and iii) to exploit some documents related to each topic (i.e. the documents given as negative or positive examples for a topic). The official results show that LGD outperforms BM25F, and that our approaches exploiting documents related to the topic requesters rest on an interpretation of this additional information that differs from that of the Social Book Search organizers.

    UJM at INEX 2009 XML Mining Track

    No full text
    8 pages. International audience. This paper reports our experiments carried out for the INEX XML Mining track 2009, which consisted in developing categorization methods for multi-labeled XML documents. We represent XML documents as vectors of indexed terms. The purpose of our experiments is twofold: first, we aim to compare strategies that reduce the index size using an improved feature selection criterion, CCD. Second, we compare a thresholding strategy we proposed (MCut) with the common RCut and PCut strategies. The index size was reduced in such a way that the results were worse than expected. However, we obtained good improvements with the MCut thresholding strategy.
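For readers unfamiliar with thresholding in multi-label categorization, here is a minimal sketch of the common rank-based strategy, RCut, which assigns each document its t best-scoring categories. The category names, scores and the value of t are made up for illustration. MCut and PCut are not shown, since the abstract does not define them.

```python
def rcut(scores, t=1):
    """Rank-based cut: keep the t highest-scoring categories for one document."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:t]


# Hypothetical classifier scores for a single document.
doc_scores = {"sports": 0.81, "politics": 0.12, "health": 0.64, "science": 0.05}
labels = rcut(doc_scores, t=2)
print(labels)
```

The weakness of a fixed t is that every document receives the same number of labels, whatever its score distribution looks like.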

    User-Centered Social Information Retrieval Model Exploiting Annotations and Social Relationships

    No full text
    International audience. Social Information Retrieval (SIR) has extended the classical information retrieval models and systems to take into account the social information of the user within his social networks. We assume that a SIR system can exploit the informational social context (ISC) of the user in order to refine his retrieval, since different users may express different information needs with the same query. Hence, we present a SIR model that takes into account the user's social data, such as his annotations and his social relationships through social networks. We propose to integrate the user's ISC into the document indexing process, allowing the SIR system to personalize the list of documents returned to the user. Our approach has shown interesting results on a test collection built from the social collaborative bookmarking network Delicious.

    LaHC at INEX 2014: Social Book Search Track

    http://ceur-ws.org/Vol-1180/CLEF2014wn-Inex-HafsiEt2014.pdf
    International audience. In this article, we describe our participation in the INEX 2014 Social Book Search track. We present the different approaches exploiting user social information such as reviews, tags and ratings. This social information is assigned by users to the books. We optimize our models using the INEX Social Book Search 2013 collection and we test them on the INEX 2014 Social Book Search track.

    Integrating structure in the probabilistic model for Information Retrieval

    No full text
    International audience. In databases or in the World Wide Web, many documents are in a structured format (e.g. XML). We propose in this article to extend the classical IR probabilistic model in order to take into account the structure through the weighting of tags. Our approach includes a learning step in which the weight of each tag is computed. This weight estimates the probability that the tag distinguishes the terms which are the most relevant. Our model has been evaluated on a large collection during INEX IR evaluation campaigns.

    Integrating user's profile in the query model for Social Information Retrieval

    No full text
    International audience. Social Information Retrieval (SIR) exploits the user's social data in order to refine the retrieval, for instance in the case where users with different backgrounds may express different information needs with the same textual query. However, this additional source of information is not supported by the classical IR process. In this article, we propose an approach to generate the user profile from his social data. This generated profile is integrated within a SIR model, which makes it possible to personalize the list of documents returned to the user.

    UJM at INEX 2008: pre impacting of tags weights

    No full text
    International audience. This paper addresses the integration of tags into a term-weighting function for focused XML retrieval. Our model makes it possible to consider a certain kind of structural information: tags that represent logical structure (title, section, etc.) as well as tags related to formatting (bold font, centered text, etc.). We first take into account the influence of tags by estimating the probability that a tag distinguishes the most relevant terms. Then, these weights are integrated into the term-weighting function using several combining schemes. Experiments on a large collection during the INEX 2008 XML IR evaluation campaign (INitiative for the Evaluation of XML Retrieval) showed that using tags leads to improvements in focused retrieval.

    Impact de l'information visuelle pour la Recherche d'Images par le contenu et le contexte

    No full text
    15 pages. National audience. Multimedia documents composed of text and images are increasingly common thanks to the Internet and to growing storage capacity. This article presents a representation model for multimedia documents that combines textual and visual information. Using a bag-of-words approach, a document composed of text and images can be described by vectors corresponding to each type of information. For a given multimedia query, a list of relevant documents is returned by linearly combining the results obtained separately on each modality. The aim of this article is to study the impact on the results of the weight assigned to visual information relative to textual information. Experiments carried out on the ImageCLEF multimedia collection, extracted from the Wikipedia encyclopedia, show that the results can be improved after a first step in which this weight is learned.
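The linear combination of per-modality results mentioned in this abstract can be sketched as a simple late fusion. The convex form (1 - alpha) * text + alpha * visual, the assumption that both score sets are already normalised to a comparable range, and all document identifiers and scores below are illustrative choices, not details taken from the paper.

```python
def fuse_scores(text_scores, visual_scores, alpha):
    """Linearly combine per-document scores; alpha is the weight of the visual modality."""
    docs = set(text_scores) | set(visual_scores)
    return {
        d: (1.0 - alpha) * text_scores.get(d, 0.0) + alpha * visual_scores.get(d, 0.0)
        for d in docs
    }


def rank(scores):
    """Document ids sorted by decreasing score, ties broken by id."""
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical normalised retrieval scores for one multimedia query.
text_scores = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
visual_scores = {"d1": 0.2, "d2": 0.8, "d3": 0.5}

for alpha in (0.0, 0.5, 1.0):
    print(alpha, rank(fuse_scores(text_scores, visual_scores, alpha)))
```

Varying alpha on held-out queries and keeping the value that gives the best ranking is one simple way to learn such a weight.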